14 research outputs found
Hierarchical Surface Prediction for 3D Object Reconstruction
Recently, Convolutional Neural Networks have shown promising results for 3D
geometry prediction. They can make predictions from very little input data such
as a single color image. A major limitation of such approaches is that they
only predict a coarse resolution voxel grid, which does not capture the surface
of the objects well. We propose a general framework, called hierarchical
surface prediction (HSP), which facilitates prediction of high resolution voxel
grids. The main insight is that it is sufficient to predict high resolution
voxels around the predicted surfaces. The exterior and interior of the objects
can be represented with coarse resolution voxels. Our approach is not dependent
on a specific input type. We show results for geometry prediction from color
images, depth images and shape completion from partial voxel grids. Our
analysis shows that our high resolution predictions are more accurate than low
resolution predictions.Comment: 3DV 201
3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection
Cameras are a crucial exteroceptive sensor for self-driving cars as they are
low-cost and small, provide appearance information about the environment, and
work in various weather conditions. They can be used for multiple purposes such
as visual navigation and obstacle detection. We can use a surround multi-camera
system to cover the full 360-degree field-of-view around the car. In this way,
we avoid blind spots which can otherwise lead to accidents. To minimize the
number of cameras needed for surround perception, we utilize fisheye cameras.
Consequently, standard vision pipelines for 3D mapping, visual localization,
obstacle detection, etc. need to be adapted to take full advantage of the
availability of multiple cameras rather than treat each camera individually. In
addition, processing of fisheye images has to be supported. In this paper, we
describe the camera calibration and subsequent processing pipeline for
multi-fisheye-camera systems developed as part of the V-Charge project. This
project seeks to enable automated valet parking for self-driving cars. Our
pipeline is able to precisely calibrate multi-camera systems, build sparse 3D
maps for visual navigation, visually localize the car with respect to these
maps, generate accurate dense maps, as well as detect obstacles based on
real-time depth map extraction
HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching
This paper presents HITNet, a novel neural network architecture for real-time
stereo matching. Contrary to many recent neural network approaches that operate
on a full cost volume and rely on 3D convolutions, our approach does not
explicitly build a volume and instead relies on a fast multi-resolution
initialization step, differentiable 2D geometric propagation and warping
mechanisms to infer disparity hypotheses. To achieve a high level of accuracy,
our network not only geometrically reasons about disparities but also infers
slanted plane hypotheses allowing to more accurately perform geometric warping
and upsampling operations. Our architecture is inherently multi-resolution
allowing the propagation of information across different levels. Multiple
experiments prove the effectiveness of the proposed approach at a fraction of
the computation required by state-of-the-art methods. At the time of writing,
HITNet ranks 1st-3rd on all the metrics published on the ETH3D website for two
view stereo, ranks 1st on most of the metrics among all the end-to-end learning
approaches on Middlebury-v3, ranks 1st on the popular KITTI 2012 and 2015
benchmarks among the published methods faster than 100ms.Comment: The pretrained models used for submission to benchmarks and sample
evaluation scripts can be found at
https://github.com/google-research/google-research/tree/master/hitne